- Multi-Agent Architect
- Design systems where multiple specialized agents collaborate to solve complex problems.
- Core Principle
- Divide complex tasks among specialized agents, each expert in its domain, coordinated through clear communication patterns.
- When to Use Multi-Agent Systems
- Use Multi-Agent When:
- ✅ Task requires multiple specializations (research + writing + coding)
- ✅ Parallel processing speeds up solution (independent subtasks)
- ✅ Need self-correction through peer review
- ✅ Complex workflows with decision points
- ✅ Scaling single-agent becomes unwieldy
- Don't Use Multi-Agent When:
- ❌ Single agent can handle task efficiently
- ❌ Task is simple and linear
- ❌ Communication overhead > parallelization benefit
- ❌ Team lacks multi-agent debugging expertise
- Multi-Agent Patterns
- Pattern 1: Sequential Pipeline
- Use: Multi-step workflow where each agent builds on the previous one
- User Query → Researcher → Analyst → Writer → Editor → Output
- Example: Research report generation
- Researcher: Gather sources
- Analyst: Synthesize findings
- Writer: Draft report
- Editor: Refine and format
- Pros: Clear dependencies, easy to debug
- Cons: Sequential (no parallelization), bottlenecks
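The pipeline above can be sketched in plain Python. The stage functions here are illustrative stand-ins for LLM-backed agents, not a real implementation; only the chaining structure is the point:

```python
# Sequential pipeline sketch: each stage consumes the previous stage's output.
# All stage functions are hypothetical stand-ins for real agents.

def researcher(query):
    return f"sources for '{query}'"

def analyst(research):
    return f"insights from [{research}]"

def writer(insights):
    return f"draft based on [{insights}]"

def editor(draft):
    return f"final: {draft}"

def run_pipeline(query, stages):
    result = query
    for stage in stages:  # strictly sequential: each stage waits on the last
        result = stage(result)
    return result

report = run_pipeline("AI market trends", [researcher, analyst, writer, editor])
```

Because each stage blocks on the previous one, the slowest agent sets the pace of the whole pipeline, which is the bottleneck noted above.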
- Pattern 2: Hierarchical (Manager-Worker)
- Use: Complex task broken into parallel subtasks
```
        Manager Agent
       /      |      \
Worker 1   Worker 2   Worker 3
(Search)  (Analyze)  (Summarize)
       \      |      /
      Aggregator Agent
```
- Example: Market research across competitors
- Manager: Decompose into per-competitor analysis
- Workers: Research competitor A, B, C in parallel
- Aggregator: Combine findings
- Pros: Parallelization, specialization
- Cons: Manager complexity, coordination overhead
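A minimal sketch of the manager-worker flow, using the standard library's thread pool to run workers in parallel. The `manager`, `worker`, and `aggregator` functions are hypothetical placeholders:

```python
from concurrent.futures import ThreadPoolExecutor

# Hierarchical sketch: the manager decomposes the task, workers run the
# subtasks concurrently, and an aggregator combines the results.
# All three roles are illustrative stand-ins for real agents.

def manager(task):
    # Decompose into independent per-competitor subtasks
    return ["Competitor A", "Competitor B", "Competitor C"]

def worker(competitor):
    return f"analysis of {competitor}"

def aggregator(results):
    return " | ".join(results)

subtasks = manager("market research")
with ThreadPoolExecutor() as pool:  # workers execute in parallel
    results = list(pool.map(worker, subtasks))  # map preserves subtask order
summary = aggregator(results)
```

For I/O-bound agents (LLM API calls), a thread pool like this is usually enough to get the parallel speedup; the coordination overhead lives in `manager` and `aggregator`.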
- Pattern 3: Peer Collaboration (Round Table)
- Use: Multiple perspectives improve quality
```
Coder ↔ Reviewer ↔ Tester
  ↓        ↓        ↓
       Consensus
```
- Example: Code generation with review
- Coder: Write initial code
- Reviewer: Check for issues
- Tester: Validate functionality
- Iterate until consensus
- Pros: Quality through review, self-correction
- Cons: May not converge, expensive (multiple LLM calls)
- Pattern 4: Agent Swarm
- Use: Many agents explore the solution space independently
```
Agent 1 → Candidate Solution 1
Agent 2 → Candidate Solution 2
Agent 3 → Candidate Solution 3
             ↓
     Selector (pick best)
```
- Example: Creative brainstorming
- 5 agents generate different approaches
- Selector evaluates and picks best
- Pros: Exploration, creativity
- Cons: Cost (N agents), may produce similar solutions
- Communication Patterns
- 1. Shared Memory
```python
shared_state = {
    "research_findings": [],
    "current_task": "analyze_competitors",
    "decisions": []
}

# All agents read/write to the shared state
researcher.execute(shared_state)
analyst.execute(shared_state)
```
- Pros: Simple, all agents see full context
- Cons: Race conditions, hard to debug who changed what
- 2. Message Passing
```python
# Agent A sends a message to Agent B
message = {
    "from": "researcher",
    "to": "analyst",
    "content": research_findings,
    "metadata": {"confidence": 0.9}
}
message_queue.send(message)
```
- Pros: Clear communication flow, traceable
- Cons: More complex to implement
- 3. Event-Driven
```python
# Agents subscribe to events
event_bus.subscribe("research_complete", analyst.on_research_complete)
event_bus.subscribe("analysis_complete", writer.on_analysis_complete)

# An agent publishes an event when it finishes
event_bus.publish("research_complete", research_data)
```
- Pros: Loose coupling, scalable
- Cons: Harder to follow execution flow
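The `event_bus` object above is left abstract; a minimal in-process version can be sketched in a few lines (the class and its API here are an assumption, not a specific library):

```python
from collections import defaultdict

# Minimal in-process event bus: subscribers register callbacks per event name,
# and publish() fans the payload out to every registered handler.
class EventBus:
    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event, handler):
        self._handlers[event].append(handler)

    def publish(self, event, payload):
        for handler in self._handlers[event]:
            handler(payload)

received = []
bus = EventBus()
bus.subscribe("research_complete", lambda data: received.append(data))
bus.publish("research_complete", {"sources": 12})
```

Publishers never reference subscribers directly, which is the loose coupling this pattern buys, and also why the execution flow is harder to follow.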
- Agent Coordination Strategies
- 1. Fixed Workflow
- Predefined sequence, no dynamic decisions
```python
workflow = [
    ("researcher", gather_info),
    ("analyst", analyze_data),
    ("writer", create_report)
]

for agent_name, task in workflow:
    result = agents[agent_name].execute(task, context)
    context.update(result)
```
- Use: Predictable tasks, clear dependencies
- 2. Dynamic Routing
- Manager decides next agent based on context
```python
class ManagerAgent:
    def route_task(self, task, context):
        if requires_technical_expertise(task):
            return tech_specialist
        elif requires_creative_input(task):
            return creative_agent
        else:
            return generalist
```
- Use: Tasks vary significantly, need flexibility
- 3. Consensus-Based
- Agents vote or reach agreement
```python
# Every agent proposes a solution, then every agent scores all proposals
proposals = [agent.propose_solution(task) for agent in agents]
scores = [agent.evaluate(proposals) for agent in agents]

# Average each proposal's scores across agents and pick the highest
best = proposals[argmax(mean(scores, axis=0))]
```
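A self-contained version of the vote, with no external dependencies. The "agents" here are hypothetical dictionaries whose scoring function simply favors longer proposals, purely to make the mechanics concrete:

```python
# Consensus sketch: each agent proposes a solution and scores every proposal;
# the proposal with the highest mean score wins. The agents and their
# length-based scoring are illustrative, not a real evaluation strategy.
agents = [
    {"propose": lambda: "plan A", "evaluate": lambda ps: [len(p) for p in ps]},
    {"propose": lambda: "longer plan B", "evaluate": lambda ps: [len(p) for p in ps]},
    {"propose": lambda: "plan C", "evaluate": lambda ps: [len(p) for p in ps]},
]

proposals = [a["propose"]() for a in agents]
scores = [a["evaluate"](proposals) for a in agents]  # one score list per agent

# Mean score per proposal across agents, then pick the best
means = [sum(col) / len(col) for col in zip(*scores)]
best = proposals[means.index(max(means))]
```

Note that every agent scores every proposal, so the evaluation cost grows quadratically with the number of agents.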
- Use: High-stakes decisions, quality critical
- Implementation with CrewAI
- CrewAI Pattern (Role-based teams):
```python
from crewai import Agent, Task, Crew

# Define specialized agents
researcher = Agent(
    role="Research Specialist",
    goal="Gather comprehensive information on {topic}",
    backstory="Expert researcher with 10 years experience",
    tools=[search_tool, scrape_tool]
)

analyst = Agent(
    role="Data Analyst",
    goal="Synthesize research findings into insights",
    backstory="Data scientist specialized in trend analysis",
    tools=[analysis_tool]
)

writer = Agent(
    role="Technical Writer",
    goal="Create clear, compelling reports",
    backstory="Professional writer with technical expertise",
    tools=[writing_tool]
)

# Define tasks
research_task = Task(
    description="Research {topic} thoroughly",
    agent=researcher,
    expected_output="Comprehensive research findings with sources"
)

analysis_task = Task(
    description="Analyze research findings for key insights",
    agent=analyst,
    context=[research_task],  # Depends on research_task
    expected_output="List of key insights and trends"
)

writing_task = Task(
    description="Write executive summary based on analysis",
    agent=writer,
    context=[research_task, analysis_task],
    expected_output="500-word executive summary"
)

# Create crew and execute
crew = Crew(
    agents=[researcher, analyst, writer],
    tasks=[research_task, analysis_task, writing_task],
    verbose=True
)
result = crew.kickoff(inputs={"topic": "AI market trends"})
```
- Implementation with LangGraph
- LangGraph Pattern (State machines):
```python
from typing import TypedDict
from langgraph.graph import StateGraph, END

class AgentState(TypedDict):
    input: str
    research: str
    analysis: str
    output: str

def research_node(state):
    research = researcher_agent.run(state["input"])
    return {"research": research}

def analysis_node(state):
    analysis = analyst_agent.run(state["research"])
    return {"analysis": analysis}

def writing_node(state):
    output = writer_agent.run(state["analysis"])
    return {"output": output}

# Build graph
workflow = StateGraph(AgentState)
workflow.add_node("research", research_node)
workflow.add_node("analysis", analysis_node)
workflow.add_node("writing", writing_node)
workflow.set_entry_point("research")
workflow.add_edge("research", "analysis")
workflow.add_edge("analysis", "writing")
workflow.add_edge("writing", END)
app = workflow.compile()

# Execute
result = app.invoke({"input": "Analyze AI market trends"})
```
- Best Practices
- 1. Clear Agent Roles: Each agent should have specific expertise and responsibilities
- 2. Minimize Communication: More agents = more coordination overhead. Start simple.
- 3. Idempotent Operations: Agents should be restartable without side effects
- 4. Failure Handling: Design for agent failures (retry, fallback, skip)
- 5. Observable Execution: Log agent decisions, trace execution flow
- 6. Cost Management: Track token usage per agent, optimize expensive calls
- Common Multi-Agent Mistakes
- ❌ Too many agents → Start with 2-3, add only if needed
- ❌ Unclear responsibilities → Define explicit roles
- ❌ No failure handling → One agent failure breaks entire system
- ❌ Synchronous bottlenecks → Parallelize independent agents
- ❌ Ignoring costs → N agents = N× LLM calls
- ❌ Over-engineering → Single agent often sufficient
- Decision Framework: Single vs Multi-Agent
```
Task Complexity?
│
├─ Simple, linear → Single Agent
│
├─ Complex, requires specialization?
│   │
│   ├─ Sequential steps → Pipeline Pattern
│   ├─ Parallel subtasks → Hierarchical Pattern
│   ├─ Need review → Peer Collaboration
│   └─ Explore solutions → Swarm Pattern
│
└─ Uncertain → Start with Single Agent, refactor to Multi if needed
```
- Monitoring & Debugging
```python
import time

# Track agent execution: wrap the base Agent with timing, logging, and metrics
class TrackedAgent(Agent):
    def execute(self, task, context):
        start = time.time()
        logger.info(f"{self.name} starting: {task}")
        result = super().execute(task, context)
        duration = time.time() - start
        logger.info(f"{self.name} completed in {duration:.2f}s")
        metrics.record({
            "agent": self.name,
            "task": task,
            "duration": duration,
            "tokens": result.token_count,
            "cost": result.cost
        })
        return result
```
- Key Metrics:
- Agent execution time
- Token usage per agent
- Success/failure rates
- Handoff delays
- Overall workflow duration
- Related Resources